Statistics of Cycles: How Loopy is your Network? 
We study the distribution of cycles of length $h$ in large networks (of size $N \gg 1$) and find it to be
an excellent ergodic estimator, even in the extreme inhomogeneous case of scale-free networks. The
distribution is sharply peaked around a characteristic cycle length, $h_* \sim N^{\alpha}$. Our results suggest
that $h_*$ and the exponent $\alpha$ might usefully characterize broad families of networks. In addition to
an exact counting of cycles in hierarchical nets, we present a Monte-Carlo sampling algorithm for
approximately locating $h_*$ and reliably determining $\alpha$. Our empirical results indicate that for small
random scale-free nets of degree exponent $\lambda$, $\alpha = 1/(\lambda - 1)$, and $\alpha$ grows as the nets become larger.
Recently, there has been much interest in large networks
arising in a natural or social context (the Internet and the
World Wide Web, networks of social contacts, networks of
predator-prey, of flight connections, the power grid, etc.)
[1, 2, 3]. Initially, such networks were believed to be modeled
by Erdős-Rényi (ER) random graphs: graphs obtained by realizing
only a fraction $p$ of the $\frac{1}{2}N(N-1)$ links that could
potentially form between the $N$ nodes present. Subsequently,
Watts and Strogatz demonstrated that the neighbors of a node,
in most of the networks in question, tend to be connected to
one another as well. This effect of clustering, absent in ER
graphs, is neatly captured in their Small World network model.
Then, Barabási et al. observed that the degree $k$ of nodes
(number of links connected to a node) in realistic networks
follows a power-law, or scale-free, distribution:
$P(k) \sim k^{-\lambda}$. The scale-free property gives rise to
exotic behavior of the networks, such as resilience to random
dilution (the percolation transition does not take place for
$\lambda < 3$), on the one hand, and high vulnerability to
removal of the most connected nodes, on the other hand, and has
become a principal focus of attention.

The importance ascribed to scale-free degree distributions
often obscures the relevance of other attributes. The question
is whether there exist other global characteristics of nets,
besides their degree distribution, that are relevant to their
performance (stability, ease of transport, searchability).
Here we propose that the statistics of cycles seems
particularly promising in this respect. Cycles are relevant to
propagation along the net, and their statistics exhibits a high
degree of ergodicity (the results do not vary much from one
node to the next). We find that, in the thermodynamic limit of
very large nets ($N \gg 1$), the distribution of cycles of
length $h$ is sharply peaked around a characteristic cycle
length $h_* \sim N^{\alpha}$. Thus the distribution can be
characterized by a single figure of merit: the exponent
$\alpha$, which we refer to as the loopiness exponent.
Generically $\alpha < 1$, but we shall see that for many well
known examples $\alpha = 1$, while for small random scale-free
nets of degree exponent $\lambda$ our preliminary results
suggest that $\alpha = 1/(\lambda - 1)$, and $\alpha$ grows as
the nets become larger.

The question of ergodicity is particularly difficult in
scale-free graphs. The highly connected nodes are responsible
for many of the special properties attributed to these nets
(lack of a percolation transition, rapid transport), yet the
lower-degree nodes account for most of the nets' mass. This
skewness makes it a challenge to identify properties
representative of the net as a whole. Consider, for example,
the clustering index, defined as $C_i = 2E_i/[k_i(k_i - 1)]$ [5]
($k_i$ is the node's degree, or number of neighbors, and $E_i$
is the number of edges connecting those neighbors). The overall
clustering index, $C = \langle C_i \rangle$, averaged over all
the nodes of the net, is a commonly cited statistic: in some
scale-free networks $C$ can be orders of magnitude larger than
in the corresponding ER graphs (ER graphs with the same numbers
of nodes and links). However, the clustering index of highly
connected nodes tends to be considerably smaller than for nodes
of small degree, and this variation is overlooked in the global
average.
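
As a rough illustration of the clustering index defined above, the following Python sketch computes $C_i$ and the average $C$ for a graph stored as an adjacency dictionary. The function names, the toy graph, and the convention $C_i = 0$ for nodes with fewer than two neighbors are assumptions made for this example only.

```python
def clustering_indices(adj):
    """Return {node: C_i} with C_i = 2*E_i / (k_i*(k_i - 1)).

    adj maps each node to the set of its neighbors (undirected graph).
    Assumed convention: C_i = 0 for nodes with fewer than two neighbors.
    """
    result = {}
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            result[node] = 0.0
            continue
        nbrs = list(neighbors)
        # E_i: number of links among the neighbors of this node
        links = sum(
            1
            for a in range(k)
            for b in range(a + 1, k)
            if nbrs[b] in adj[nbrs[a]]
        )
        result[node] = 2.0 * links / (k * (k - 1))
    return result


def mean_clustering(adj):
    """Overall clustering index C = <C_i>, averaged over all nodes."""
    values = clustering_indices(adj)
    return sum(values.values()) / len(values)


# Hypothetical toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
indices = clustering_indices(toy)
```

In this toy graph the best connected node (node 2) has a smaller $C_i$ than its degree-2 neighbors, which is the kind of variation the global average hides.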

The ergodicity problem is solved in the statistics of cycles
in the following sense. An $h$-cycle is a closed path through
$h$ connected links that is self-avoiding (does not revisit
nodes, other than the first). Define the global statistic $N_h$
as the total number of distinct $h$-cycles in the graph (cyclic
permutations of the nodes do not count). The local counterpart,
$N_h^{(i)}$, is the number of $h$-cycles that pass through node
$i$. We argue that in scale-free graphs, it is likely that any
cycle of moderate length $h$ will pass through the most
connected node, so the difference between $N_h$ and $N_h^{(i)}$
could be quite small, making $N_h$ a good global statistic.

FIG. 1: Recursive scale-free graph, with $\lambda = 1 + \ln 3/\ln 2$.
Generation $n + 1$ is obtained by joining three replicas at the
hubs (most connected nodes) A, B, and C. The closely related
Sierpinski Gasket calls for joining the replicas at the vertices,
1, 2, and 3.

Consider the deterministic scale-free graph of Fig. 1. Each
successive generation is obtained from the previous one by
connecting a new node to both endpoints of every existing link.
Alternatively, the $(n+1)$-th generation could be constructed by
adjoining three copies of generation $n$ at the hubs, the most
connected nodes (denoted by A, B, C in Fig. 1). It can be shown
that the degree of the nodes is distributed in scale-free
fashion, with $\lambda = 1 + \ln 3/\ln 2$ [10]. The recursive
nature of this graph allows us to obtain the exact statistics
of cycles. Let $N_h(n)$ and $L_h(n)$ be, respectively, the number
of $h$-cycles and the number of self-avoiding paths of length
$h$ connecting two hubs (A and B) in graphs of generation $n$.
Then,

$$N_h(n+1) = 3N_h(n) + \sum_{h_1+h_2+h_3=h} L_{h_1}(n)\,L_{h_2}(n)\,L_{h_3}(n) , \qquad (1)$$

$$L_h(n+1) = L_h(n) + \sum_{h_1+h_2=h} L_{h_1}(n)\,L_{h_2}(n) . \qquad (2)$$



$N_h$ given by these relations is plotted in Fig. 2. Similar
relations hold for the number of $h$-cycles that pass through a
hub. The two statistics become virtually identical beyond a
small threshold $h$, confirming that $N_h^{(A)}$ does indeed
constitute a good global estimator (Fig. 3).
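
The recursion relations (1) and (2) can be iterated numerically by treating $N_h$ and $L_h$ as coefficient lists, so that the sums over $h_1 + h_2 (+ h_3) = h$ become polynomial products. The sketch below is a minimal illustration, not the authors' code; it assumes that generation 0 is a single link joining two hubs (so that generation 1 is a triangle), and all function names are hypothetical.

```python
def poly_mul(a, b):
    """Convolution of two coefficient lists (index = path or cycle length)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Element-wise sum of two coefficient lists of possibly different length."""
    size = max(len(a), len(b))
    return [
        (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        for i in range(size)
    ]


def cycle_counts(generation):
    """Return (N, L) for the given generation.

    N[h] = number of h-cycles, L[h] = number of self-avoiding hub-to-hub
    paths of length h. Assumed base case: generation 0 is a single link.
    """
    n_cycles = [0]      # no cycles in a single link
    n_paths = [0, 1]    # one hub-to-hub path of length 1
    for _ in range(generation):
        pairs = poly_mul(n_paths, n_paths)
        # Eq. (1): cycles inside one replica, or one path through each replica
        n_cycles = poly_add([3 * c for c in n_cycles], poly_mul(pairs, n_paths))
        # Eq. (2): direct path, or a detour through the third hub
        n_paths = poly_add(n_paths, pairs)
    return n_cycles, n_paths


def most_likely_length(n_cycles):
    """Cycle length h at which N_h is largest."""
    return max(range(len(n_cycles)), key=n_cycles.__getitem__)


cycles_gen2, paths_gen2 = cycle_counts(2)
```

For generation 2 (six nodes) the counts can be checked by hand: four triangles, three 4-cycles, three 5-cycles and one 6-cycle. Python integers do not overflow, so the same function can be used for the larger generations, where the counts become very large.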
Evident in Fig. 2 is the scaling property

$$\ln N_h = h_* f\!\left(\frac{h}{h_*}\right) , \qquad (3)$$

where the scaling function $f(x)$ is well-approximated by a
parabola about its peak at $x = 1$ (or $h = h_*$). Thus, the
distribution of cycles, expressed in terms of the scaling
parameter $x = h/h_*$, is nearly a Gaussian of width
$1/\sqrt{h_*}$, and converges to a delta function in the
thermodynamic limit of $h_* \to \infty$, or $N \to \infty$
(Fig. 2 inset). It follows that in the thermodynamic limit the
statistics of cycles is characterized by a single parameter,
$h_*$, or better yet, by the way this quantity depends on the
size of the net. For the network of Fig. 1 we have

$$h_* \sim N^{\alpha} , \qquad \alpha = \frac{\ln 2}{\ln 3} , \qquad (4)$$

(since the number of nodes in the net is $N(n) = (3^n + 3)/2$,
and $h_*(n) \sim 2^n$). We have tested other recursive nets and
found various model-dependent exponents $\alpha < 1$. An
interesting question, which we examine below, is the value of
$\alpha$ for random scale-free nets.
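
The two steps used in this paragraph can be written out explicitly. The following is a short sketch under a stated assumption: near its peak the scaling function is taken to be $f(x) \approx f(1) - \tfrac{a}{2}(x-1)^2$, where $a > 0$ is an unspecified constant introduced here for illustration.

```latex
% Gaussian shape and width (assumed parabolic form of f near x = 1)
\ln N_h = h_* f(x) \approx h_* f(1) - \frac{a\,h_*}{2}\,(x-1)^2
\;\Longrightarrow\;
N_h \propto \exp\!\left[-\frac{(x-1)^2}{2\,(a\,h_*)^{-1}}\right],
\qquad \text{width in } x \sim \frac{1}{\sqrt{h_*}} .

% Loopiness exponent of the recursive net, from N(n) = (3^n + 3)/2 and h_*(n) \sim 2^n
N \approx \tfrac{1}{2}\,3^{n}
\;\Longrightarrow\;
n \approx \frac{\ln (2N)}{\ln 3}
\;\Longrightarrow\;
h_* \sim 2^{n} = e^{\,n \ln 2} \sim (2N)^{\ln 2/\ln 3} \sim N^{\ln 2/\ln 3},
\qquad \alpha = \frac{\ln 2}{\ln 3} \approx 0.631 .
```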

FIG. 2: Statistics of cycles for the network of Fig. 1. Shown
are $\ln N_h$ for generations $n = 3, 4, 5, 6, 7$. Inset: The
probability distribution for cycles of length $x = h/h_*$ tends
to a delta function as $n \to \infty$ ($N \to \infty$).

FIG. 3: The number of $h$-cycles that hit a hub, $N_h^{(A)}$,
compared to the global statistics of $N_h$. Shown are results
for generation $n = 7$, superposed upon a plot of $P(h)$. Note
the perfect agreement of the two statistics, where $P(h)$ is
significant.

We now examine the statistics of cycles in other well known
networks. In a regular square lattice of $\sqrt{N} \times \sqrt{N}$
sites, Jensen and Guttmann find a similar scaling to that
suggested by the example above, with $h_* \approx 0.8N$, or
$\alpha = 1$. It is also interesting to compare to the
statistics of cycles in the Sierpinski Gasket, a fractal lattice
which is closely related to the scale-free net of Fig. 1. For a
given generation, the two graphs have the same number of nodes
and links, but the Sierpinski Gasket constitutes almost a
regular graph, where all nodes other than the three vertices
share the same constant degree, $k = 4$. Here too, exploiting
the recursive nature of the lattice, and following an exact
counting procedure [13], we find the same kind of scaling as in
Eq. (3), but with $h_* \sim N$, or $\alpha = 1$.

Next, consider a complete graph of order $N$, $K_N$. Starting
from an arbitrary node, the next node in the cycle can be chosen
from any of the remaining $N - 1$ nodes, etc., yielding

$$N_h = \frac{N!}{2h\,(N-h)!} , \quad \text{for complete graph.} \qquad (5)$$

The additional factor of $1/(2h)$ corrects for overcounting:
it does not matter where a cycle starts ($h$ possibilities)
and whether one traces it clockwise or counterclockwise.
At any rate, it follows that $h_* \sim N - 1$ ($N \gg 1$), and
once again $\alpha = 1$. For the case of ER graphs, with only a
fraction $p$ of the links realized, we cannot offer an exact
expression but instead make the following approximation: Each
link in an $h$-cycle is present with probability $p$, and so
the whole cycle exists with probability $p^h$. We then ignore
the correlation between cycles (due to the fact that different
cycles might share a subset of links) and write

$$N_h \approx \frac{N!}{2h\,(N-h)!}\, p^h , \quad \text{for ER graphs.} \qquad (6)$$

This expression is of course exact in the limit of $p \to 1$,
and it correctly predicts the breakdown of cycles at the
percolation transition threshold of $p_c = 1/(N-1)$. When $p$
is fixed and $N \to \infty$, we find once again $h_* \approx N$,
suggesting $\alpha = 1$. If the $N \to \infty$ limit is
approached along with $p = \omega/N \to 0$, so as to keep next
to the percolation transition, Eq. (6) predicts
$h_* \approx (1 - \omega^{-1})N$, and still $\alpha = 1$.
However, the shortcomings of the approximation involved make
this last result rather questionable.
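
The location of the peak implied by Eqs. (5) and (6) is easy to check numerically. The sketch below evaluates $\ln N_h$ with log-gamma functions (to avoid huge factorials) and scans for the maximum; the function names and the example values of $N$ and $p$ are illustrative assumptions, and the ER case inherits the uncorrelated-cycle approximation discussed above.

```python
import math


def ln_cycle_count(n_nodes, h, p=1.0):
    """ln N_h for N_h ~ N! / (2h (N-h)!) * p**h.

    p = 1 gives the complete graph, Eq. (5); p < 1 gives the ER
    approximation, Eq. (6).
    """
    return (
        math.lgamma(n_nodes + 1)
        - math.lgamma(n_nodes - h + 1)
        - math.log(2 * h)
        + h * math.log(p)
    )


def peak_length(n_nodes, p=1.0):
    """Cycle length h (3 <= h <= N) that maximizes ln N_h."""
    return max(
        range(3, n_nodes + 1),
        key=lambda h: ln_cycle_count(n_nodes, h, p),
    )


complete_peak = peak_length(10)        # complete graph K_10
er_peak = peak_length(1000, p=0.01)    # p = omega/N with omega = 10
```

With $N = 1000$ and $\omega = 10$ the maximum falls next to $(1 - \omega^{-1})N = 900$, in line with the estimate quoted in the text.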

It would seem that in most cases the self-avoiding cycles are
space filling ($\alpha = 1$), yet it is not clear whether this is
true for ER graphs near the percolation transition. At any rate,
several recursive scale-free nets exhibit $\alpha < 1$. In order
to study this and similar issues, we resort to a simple
Monte-Carlo procedure for sampling the enormous number of cycles
(of all sizes) that arise in various nets.

We first prune the net of all 'dangling ends': nodes of degree
$k = 1$, and the link leading to the node, are removed from the
net. This action is reiterated until all extant nodes are of
degree $k \geq 2$. To find the frequency of cycles, we perform a
self-avoiding random walk, starting from a randomly selected
node. Each step is chosen randomly among all the possibilities
that would not result in self-intersection (other than with the
starting node). The walk is terminated when it comes back to the
starting node, and the number of steps, $h$, is recorded. A cycle
produced in this way is a biased representative of the subset of
$h$-cycles, because the excluded-volume constraint (the
restriction of no self-intersection) is not uniform along the
walk, becoming more severe as the walk progresses. However, we
know that this effect can be neglected in regular lattices of
dimension $d > 4$ [14]. We argue that the effect is likewise
minimal in the environment of large, multiply connected
networks.
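
The pruning and self-avoiding walk described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: graphs are adjacency dictionaries with sortable node labels, nodes left with no links are pruned along with degree-1 nodes, and a walk that reaches a dead end is simply discarded (the text does not say how such walks are handled). All names are hypothetical.

```python
import random
from collections import Counter


def prune_dangling_ends(adj):
    """Iteratively remove nodes of degree <= 1; returns a pruned copy."""
    core = {u: set(vs) for u, vs in adj.items()}
    stack = [u for u, vs in core.items() if len(vs) <= 1]
    while stack:
        u = stack.pop()
        if u not in core:
            continue  # already removed
        for v in core.pop(u):
            core[v].discard(u)
            if len(core[v]) <= 1:
                stack.append(v)
    return core


def self_avoiding_cycle(adj, rng):
    """One self-avoiding walk; returns the list of visited nodes if the walk
    closes on its starting node, or None if it hits a dead end (assumption)."""
    start = rng.choice(sorted(adj))
    path, visited, current = [start], {start}, start
    while True:
        options = [v for v in sorted(adj[current]) if v not in visited]
        # Returning to the start is allowed once the cycle would have h >= 3
        if len(path) >= 3 and start in adj[current]:
            options.append(start)
        if not options:
            return None
        step = rng.choice(options)
        if step == start:
            return path
        path.append(step)
        visited.add(step)
        current = step


def sample_cycle_lengths(adj, n_walks, seed=0):
    """Histogram of cycle lengths h found in n_walks attempts."""
    rng = random.Random(seed)
    core = prune_dangling_ends(adj)
    lengths = Counter()
    if not core:
        return lengths
    for _ in range(n_walks):
        cycle = self_avoiding_cycle(core, rng)
        if cycle is not None:
            lengths[len(cycle)] += 1
    return lengths


# Hypothetical toy graph: triangle 0-1-2 with a dangling node 3
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
histogram = sample_cycle_lengths(toy, n_walks=50)
```

The raw histogram returned here is unweighted; each walk counts once regardless of the degrees of the nodes it visits.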

FIG. 4: Sampled frequency of cycles, $P(h)'$, compared to their
true frequency (o), $P(h)$, in a small ER net of 18 nodes. Inset:
The ratio $P(h)'/P(h)$ increases exponentially with $h$.

FIG. 5: Statistics of cycles in random scale-free graphs with
$\lambda = 3$. Results from Monte-Carlo counting are shown for
nets of size $N = 100, 200, 400, 800$. Inset: Scaling of $h_*$
with $N$ (o) is consistent with $h_* \sim N^{\alpha}$,
$\alpha = 1/(\lambda - 1) = 0.5$ (solid line) only for $N$ small.

The frequency of $h$-cycles found in this way is underestimated.
Consider an arbitrary $h$-cycle already laid on the net. Suppose
that we are on node $i$ on the cycle and we take a step, choosing
randomly from the $k_i - 1$ links that would not force us back
through the link leading to $i$. The probability of hitting the
next node on the cycle is $1/(k_i - 1)$. The probability of
finding that particular cycle is then proportional to
$\prod_i 1/(k_i - 1)$, where the product is taken over the nodes
of the cycle. Thus, to better represent the true frequency of
cycles, a cycle found by the self-avoiding random walk procedure
is counted $\prod_i (k_i - 1)$ times. This factor is actually too
large, because some of the $k_i - 1$ links, when followed from
node $i$, lead only to dead ends (paths that self-intersect
before completing the cycle). The net effect is that the
frequency of cycles of length $h$ is overestimated by an
exponential factor $\sim e^{ch}$. This is illustrated in Fig. 4,
where we compare the sampled frequency of cycles, $P(h)'$, to
their true frequency, $P(h)$, in a small ER graph of 18 nodes.
(The true frequency of cycles in such small nets can be counted
by properly adapted depth-first-search algorithms.)
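
The two ingredients of this comparison, the weight $\prod_i (k_i - 1)$ assigned to a sampled cycle and an exact count for small nets, might be sketched as follows. The depth-first enumeration shown is one standard way of counting simple cycles and is an assumption here, since the text does not specify which adaptation of depth-first search was used; function names are hypothetical and node labels are assumed to be comparable.

```python
from math import prod


def cycle_weight(cycle, adj):
    """Weight prod_i (k_i - 1) over the nodes of a sampled cycle."""
    return prod(len(adj[v]) - 1 for v in cycle)


def exact_cycle_counts(adj):
    """Exact number of h-cycles for each h, by exhaustive depth-first search.

    Each cycle is rooted at its smallest node, so it is found exactly twice
    (once per direction); the final counts are halved accordingly.
    Only practical for small graphs.
    """
    found = {}

    def dfs(start, current, visited, length):
        for v in adj[current]:
            if v == start and length >= 3:
                found[length] = found.get(length, 0) + 1
            elif v > start and v not in visited:
                visited.add(v)
                dfs(start, v, visited, length + 1)
                visited.remove(v)

    for node in adj:
        dfs(node, node, {node}, 1)
    return {h: count // 2 for h, count in found.items()}


# Check against Eq. (5) on the complete graph K_5
k5 = {i: {j for j in range(5) if j != i} for i in range(5)}
k5_counts = exact_cycle_counts(k5)
```

For $K_5$ the enumeration returns 10, 15 and 12 cycles of length 3, 4 and 5, matching $N!/[2h(N-h)!]$. A weighted histogram is obtained by adding `cycle_weight(cycle, adj)` instead of 1 for each sampled cycle.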

As a result of the exponential overestimate, the most likely
cycle is sampled at an apparent location
$h'_* = h_* + c\sigma^2/2$, where $\sigma$ is the width of the
distribution $P(h)$. In all cases examined above,
$\sigma^2 \sim N^{\alpha}$, so that $h'_* \sim N^{\alpha}$ and the
sampling procedure yields the correct exponent $\alpha$.
(However, if $\sigma^2 \sim N^{\beta}$, $\beta > \alpha$, our
sampling procedure would find the exponent $\beta$ rather than
$\alpha$.) Indeed, when applied to the recursive nets of Fig. 1,
the sampling algorithm finds $h'_* \approx 1.08\,h_*$, and
correctly predicts $\alpha = 0.63 \pm 0.02$ (compare with the
exact result, $\alpha = \ln 2/\ln 3 \approx 0.6309$).
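
The shift of the apparent peak can be verified numerically. The sketch below assumes a Gaussian of the form $P(h) \propto \exp[-(h - h_*)^2/\sigma^2]$, which is the width convention under which multiplying by $e^{ch}$ moves the maximum by exactly $c\sigma^2/2$; this convention, the function name and the parameter values are assumptions made for illustration.

```python
def biased_peak(h_star, sigma, c, h_max):
    """Integer h in [1, h_max] maximizing exp(-(h - h_star)^2 / sigma^2) * exp(c*h).

    Assumed width convention: P(h) ~ exp(-(h - h_star)**2 / sigma**2).
    """
    def log_sampled(h):
        return -((h - h_star) ** 2) / sigma ** 2 + c * h

    return max(range(1, h_max + 1), key=log_sampled)


# Hypothetical numbers: h_* = 100, sigma = 10, c = 0.2 gives a shift of c*sigma^2/2 = 10
shifted = biased_peak(h_star=100, sigma=10, c=0.2, h_max=300)

# If sigma^2 grows like h_*, the shift is proportional to h_* and the exponent is preserved
proportional = biased_peak(h_star=400, sigma=20, c=0.1, h_max=1000)
```

When $\sigma^2$ is proportional to $h_*$, as in the second call, the apparent peak is a fixed multiple of $h_*$, which is why the sampled and true peaks scale with the same exponent.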

FIG. 6: Typical random scale-free net of $\lambda = 3$ and
$N = 100$. The net separates into small, loopless components
(shown on the right) and a giant component with loops involving
mostly the first shell of nodes attached to the hub.

We have applied the Monte-Carlo sampling to random scale-free
graphs of degree exponent $\lambda = 3$. Our results, presented
in Fig. 5, are consistent with the relation
$\alpha = 1/(\lambda - 1)$ when the nets are small ($N < 200$),
but $\alpha$ grows as $N$ increases. This can be understood in
the following way. For $N \approx 100$ and $\lambda = 3$ we find
that most cycles are formed between the hub and nodes in the
first shell (nodes connected to the hub by one link); see
Fig. 6. The few nodes in the second shell (two links away from
the hub) that form part of cycles almost never connect to one
another. The likely cycle length is then proportional to the
number of nodes in the first shell, or to the degree of the hub,
$K \sim N^{1/(\lambda - 1)}$. Thus, for small $N$ the loop
exponent is $\alpha = 1/(\lambda - 1)$. As $N$ grows larger,
nodes in higher shells form part of interconnected cycles and
$\alpha$ increases.
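
The hub-degree scaling used in this argument follows from a standard extreme-value estimate, sketched here under the assumption of a continuous-$k$ approximation to the degree distribution: the hub degree $K$ is the value above which about one node is expected.

```latex
% Assumption: continuous-k approximation of P(k) \sim k^{-\lambda}, with \lambda > 1
N \int_{K}^{\infty} P(k)\,dk \sim 1
\;\Longrightarrow\;
N\,K^{-(\lambda - 1)} \sim 1
\;\Longrightarrow\;
K \sim N^{1/(\lambda - 1)},
\qquad \text{e.g. } \lambda = 3:\; K \sim N^{1/2} .
```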

Cycles have been studied before, and several interesting
results were obtained for cycles of small length, $h \ll N$
[11, 17]. Our study indicates that the full distribution of
cycles, of all possible lengths, displays additional useful
properties: Ergodicity is implied in the fact that the
distribution of cycles that pass through a node is similar for
most nodes of the net, even in the extreme inhomogeneous case of
scale-free networks. For large nets, the distribution resembles
a delta function that peaks about a typical cycle size,
$h_* \sim N^{\alpha}$. The exponent $\alpha$ serves as a single
figure of merit that characterizes the "loopiness" of the net in
question; the larger $\alpha$, the more loopy the net.
$\alpha = 1$ for regular lattices and fractals, and for complete
graphs. For small, random scale-free nets
$\alpha = 1/(\lambda - 1)$, but $\alpha$ increases as the nets
become larger. It remains an open question whether the loopiness
exponent saturates at $\alpha = 1$ as $N \to \infty$.

Among the many remaining open questions, finding reliable and
efficient algorithms for sampling the distribution of cycles is
perhaps the most important one. Only when these become available
will we be able to study the full statistics of cycles in the
truly large nets that have been the focus of so much recent
attention.